4DXpress: a database for cross-species expression pattern comparisons

Author: A. Brazma
C. Girardot
Christiansen
D. Arendt
Deutsch
E. E. M. Furlong
Grumbling
H. Berube
I. Letunic
J. Gagneur
J. Wittbrodt
M. Kapushesky
P. Bork
P.-D. Weeber
Sprague
T. Henrich
Tomancak
Y. Haudry
Publication venue: Oxford University Press
Publication date: 01/01/2008

In the major animal model species like mouse, fish or fly, detailed spatial information on gene expression over time can be acquired through whole mount in situ hybridization experiments. In these species, expression patterns of many genes have been studied and data has been integrated into dedicated model organism databases like ZFIN for zebrafish, MEPD for medaka, BDGP for Drosophila or GXD for mouse. However, a central repository that allows users to query and compare gene expression patterns across different species has not yet been established. Therefore, we have integrated expression patterns for zebrafish, Drosophila, medaka and mouse into a central public repository called 4DXpress (expression database in four dimensions). Users can query anatomy ontology-based expression annotations across species and quickly jump from one gene to the orthologues in other species. Genes are linked to public microarray data in ArrayExpress. We have mapped developmental stages between the species to be able to compare developmental time phases. We store the largest collection of gene expression patterns available to date in an individual resource, reflecting 16 505 annotated genes. 4DXpress will be an invaluable tool for developmental as well as for computational biologists interested in gene regulation and evolution. 4DXpress is available at http://ani.embl.de/4DXpress

MDC Repository

ArrayExpress—a public repository for microarray gene expression data at the EBI

Author: Abeygunawardena N.
Brazma A.
Contrino S.
Coulson R.
Farne A.
Garcia Lara G.
Holloway E.
Kapushesky M.
Lilja P.
Mukherjee G.
Oezcimen A.
Parkinson H.
Rayner T.
Rocca-Serra P.
Sansone S.
Sarkans U.
Sharma A.
Shojatalab M.
Publication venue: Oxford University Press
Publication date: 17/12/2004

ArrayExpress is a public repository for microarray data that supports the MIAME (Minimum Information About a Microarray Experiment) requirements and stores well-annotated raw and normalized data. As of November 2004, ArrayExpress contains data from ∼12 000 hybridizations covering 35 species. Data can be submitted online or directly from local databases or LIMS in a standard format, and password-protected access to prepublication data is provided for reviewers and authors. The data can be retrieved by accession number or queried by various parameters such as species, author and array platform. A facility to query experiments by gene and sample properties is provided for a growing subset of curated data that is loaded into the ArrayExpress data warehouse. Data can be visualized and analysed using Expression Profiler, the integrated data analysis tool. ArrayExpress is available at http://www.ebi.ac.uk/arrayexpress

CiteSeerX

VisHiC—hierarchical functional enrichment analysis of microarray data

Author: Ashburner
Azhar
Brazma
Chalmel
D. Krushevskaya
Draghici
Eisen
Fan
Ge
Griffiths-Jones
H. Peterson
Hu
Huang
J. Reimand
J. Vilo
Jha
Kapushesky
Kull
Loh
M. Kull
Matys
Okada
Ovaska
Ross
Schena
Tavazoie
Troyanskaya
Vastrik
Publication venue: Oxford University Press
Publication date

Measuring gene expression levels with microarrays is one of the key technologies of modern genomics. Clustering of microarray data is an important application, as genes with similar expression profiles may be regulated by common pathways and involved in related functions. Gene Ontology (GO) analysis and visualization allows researchers to study the biological context of discovered clusters and characterize genes with previously unknown functions. We present VisHiC (Visualization of Hierarchical Clustering), a web server for clustering and compact visualization of gene expression data combined with automated function enrichment analysis. The main output of the analysis is a dendrogram and visual heatmap of the expression matrix that highlights biologically relevant clusters based on enriched GO terms, pathways and regulatory motifs. Clusters with most significant enrichments are contracted in the final visualization, while less relevant parts are hidden altogether. Such a dense representation of microarray data gives a quick global overview of thousands of transcripts in many conditions and provides a good starting point for further analysis. VisHiC is freely available at http://biit.cs.ut.ee/vishic
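The clustering step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which distance or linkage VisHiC uses, so the choice of 1 - Pearson correlation with average linkage is an assumption, and the gene names and values are hypothetical.

```python
from math import sqrt


def pearson_distance(a, b):
    """Return 1 - Pearson correlation between two expression profiles.

    Profiles with zero variance are not handled (division by zero).
    """
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / sqrt(var_a * var_b)


def average_linkage(profiles):
    """Agglomerative clustering with average linkage.

    profiles: dict mapping gene name -> list of expression values.
    Returns the merge history as (left_members, right_members, distance).
    """
    names = list(profiles)
    # Pairwise gene-to-gene distances, computed once.
    dist = {(a, b): pearson_distance(profiles[a], profiles[b])
            for i, a in enumerate(names) for b in names[i + 1:]}

    def gene_dist(a, b):
        return dist[(a, b)] if (a, b) in dist else dist[(b, a)]

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average linkage: mean distance over all member pairs.
                pairs = [gene_dist(a, b)
                         for a in clusters[i] for b in clusters[j]]
                d = sum(pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical expression profiles across four conditions.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.5],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.5, 4.0, 2.0],
}
merges = average_linkage(profiles)
for left, right, distance in merges:
    print(left, right, round(distance, 4))
```

The merge history is the information a dendrogram draws. A tool of this kind would then test each subtree for enriched annotations, a step this sketch omits.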

GeneSigDB: a manually curated database and resource for analysis of gene expression signatures

Author: A. C. Culhane
A.-A. St Pierre
Al-Shahrour
B. Haibe-Kains
Bardelli
C. Kelly
D. Gusenleitner
E. N. Martinelli
Fan
G. Papenhausen
J. Quackenbush
K. C. Picard
M. Correll
M. Kapushesky
M. S. Schroder
Mootha
N. O'Connor
R. Sultana
Raychaudhuri
S. C. Picard
W. Flahive
Wu
Wu
Zeeberg
Zhao
Publication venue: Oxford University Press
Publication date: 19/04/2012

GeneSigDB (http://www.genesigdb.org or http://compbio.dfci.harvard.edu/genesigdb/) is a database of gene signatures that have been extracted and manually curated from the published literature. It provides the community with a standardized resource of published prognostic, diagnostic and other gene signatures of cancer and related diseases, so that users can compare the predictive power of gene signatures or use them in gene set enrichment analysis. Since GeneSigDB release 1.0, we have expanded from 575 to 3515 gene signatures, which were collected and transcribed from 1604 published articles largely focused on gene expression in cancer, stem cells, immune cells, development and lung disease. We have made substantial upgrades to the GeneSigDB website to improve accessibility and usability, including adding a tag cloud browse function, facetted navigation and a ‘basket’ feature to store genes or gene signatures of interest. Users can analyze GeneSigDB gene signatures, or upload their own gene list, to identify gene signatures with significant gene overlap, and results can be viewed on a dynamic editable heatmap that can be downloaded as a publication-quality image. All data in GeneSigDB can be downloaded in numerous formats, including the .gmt file format for gene set enrichment analysis or an R/Bioconductor data file. GeneSigDB is available from http://www.genesigdb.org
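As a rough illustration of the overlap analysis described above, the sketch below parses gene sets in the tab-separated .gmt layout (set name, description, then gene symbols) and scores the overlap with a query list. The abstract does not say which statistic GeneSigDB applies, so the one-sided hypergeometric test is an assumption; the signature names, gene lists and universe size are hypothetical.

```python
from math import comb


def parse_gmt(text):
    """Parse .gmt text: one gene set per line, tab-separated as
    name, description, gene1, gene2, ..."""
    signatures = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, _description, *genes = fields
        signatures[name] = {g for g in genes if g}
    return signatures


def overlap_p_value(query, signature, universe_size):
    """Return (overlap size, P(overlap >= observed)) under a
    hypergeometric model.

    Assumes query and signature are both drawn from the same gene
    universe of universe_size genes.
    """
    k = len(query & signature)
    n, m = len(query), len(signature)
    total = comb(universe_size, n)
    tail = sum(comb(m, i) * comb(universe_size - m, n - i)
               for i in range(k, min(n, m) + 1))
    return k, tail / total


# Hypothetical two-signature file and query list.
GMT_TEXT = (
    "SIG_A\tdemo signature\tTP53\tBRCA1\tMYC\tEGFR\n"
    "SIG_B\tdemo signature\tGAPDH\tACTB\n"
)
signatures = parse_gmt(GMT_TEXT)
query = {"TP53", "MYC", "KRAS"}
results = {name: overlap_p_value(query, genes, universe_size=20)
           for name, genes in signatures.items()}
for name, (k, p) in results.items():
    print(name, k, round(p, 4))
```

In practice the universe would be the set of genes measurable on the platform, and p-values across thousands of signatures would need a multiple-testing correction.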

Harvard University - DASH

Genome Expression Pathway Analysis Tool – Analysis and visualization of microarray gene expression data under genomic, proteomic and metabolic context

Author: A Rosenwald
AA Alizadeh
AI Saeed
B Mlecnik
B Zhang
BM Bolstad
C von Mering
F Al-Shahrour
Gene Ontology Consortium
GJ Dennis
GK Smyth
J Rainer
JM Vaquerizas
Julia C Engelmann
Jörg Schultz
M Kanehisa
M Kapushesky
M Kotera
M Masseroli
M Pelizzola
Markus Weniger
O Troyanskaya
P Khatri
P Lichter
P Shannon
R Gentleman
R Shamir
S Bea
SW Doniger
TJP Hubbard
W Huber
YH Yang
Publication venue: BioMed Central
Publication date: 01/06/2007

Abstract
Background: Regulation of gene expression is relevant to many areas of biology and medicine, in the study of treatments, diseases, and developmental stages. Microarrays can be used to measure the expression level of thousands of mRNAs at the same time, allowing insight into or comparison of different cellular conditions. The data derived from microarray experiments are high-dimensional and often noisy, and interpretation of the results can be intricate. Although programs for the statistical analysis of microarray data exist, most of them do not integrate analysis results with biological interpretation.
Results: We have developed GEPAT, Genome Expression Pathway Analysis Tool, which offers an analysis of gene expression data under genomic, proteomic and metabolic context. We provide an integration of statistical methods for data import and data analysis together with a biological interpretation for subsets of probes or single probes on the chip. GEPAT imports various types of oligonucleotide and cDNA array data formats. Different normalization methods can be applied to the data, after which data annotation is performed. After import, GEPAT offers various statistical data analysis methods, such as hierarchical, k-means and PCA clustering, a linear model based t-test or chromosomal profile comparison. The results of the analysis can be interpreted by enrichment of biological terms, pathway analysis or interaction networks. Several biological databases are included to provide information on each probe on the chip. GEPAT does not impose a linear workflow but allows any subset of probes and samples to be used as the start of a new data analysis. GEPAT relies on established data analysis packages, offers a modular approach for easy extension, and can be run on a computer grid to support a large number of users. It is freely available under the LGPL open source license for academic and commercial users at http://gepat.sourceforge.net.
Conclusion: GEPAT is a modular, scalable and professional-grade software integrating analysis and interpretation of microarray gene expression data. An installation available for academic users can be found at http://gepat.bioapps.biozentrum.uni-wuerzburg.de.

University of Regensburg Publication Server

Gitools: Analysis and Visualisation of Genomic Data Using Interactive Heat-Maps

Author: A Floratos
A Mascarell-Creus
A Sturn
A Subramanian
AI Saeed
B Usadel
BR Zeeberg
BR Zeeberg
Christian Perez-Llamas
D Smedley
DW Huang
DW Huang
G Gundem
I Ferreiro
I Medina
J Chen
J Hou
JN Weinstein
M Ashburner
M Hall
M Kanehisa
M Kapushesky
M Reich
MA Sartor
MA Sartor
MC Whitlock
N Lopez-Bigas
N Lopez-Bigas
Nuria Lopez-Bigas
P Pavlidis
R Shamir
S Holm
Stein Aerts
TJP Hubbard
V Rodilla
Y Benjamini
Publication venue: Public Library of Science
Publication date: 01/01/2011

Intuitive visualization of data and results is very important in genomics, especially when many conditions are to be analyzed and compared. Heat-maps have proven very useful for the representation of biological data. Here we present Gitools (http://www.gitools.org), an open-source tool to perform analyses and visualize data and results as interactive heat-maps. Gitools contains data import systems from several sources (i.e. IntOGen, Biomart, KEGG, Gene Ontology), which facilitate the integration of novel data with previous knowledge

CiteSeerX

UPF Digital Repository

EzArray: A web-based highly automated Affymetrix expression array data management and analysis system

Author: A Brazma
BM Bolstad
C Li
C Romualdi
CM Kendziorski
D Rajagopalan
E Hubbell
GK Smyth
H Rehrauer
HM Hsueh
J Rainer
JM Vaquerizas
JM Wettenhall
K Hokamp
L Jones
M Kapushesky
M Psarros
MA Newton
O Larsson
R Diaz-Uriarte
R Edgar
R Ihaka
RA Irizarry
RA Irizarry
S Dudoit
S Vardhanabhuti
S Zhang
VG Tusher
Wei Xu
WK Lim
WM Liu
X Xia
Y Barash
Yuelin Zhu
Yuerong Zhu
Publication venue: BioMed Central
Publication date: 01/01/2008

Abstract
Background: Although microarray experiments are very popular in life science research, managing and analyzing microarray data remain challenging tasks for many biologists. Most microarray programs require users to have sophisticated knowledge of mathematics and statistics as well as computer skills. With microarray data accumulating in public databases, easy-to-use programs to re-analyze previously published microarray data are in high demand.
Results: EzArray is a web-based Affymetrix expression array data management and analysis system for researchers who need to organize microarray data efficiently and get data analyzed instantly. EzArray organizes microarray data into projects that can be analyzed online with predefined or custom procedures. EzArray performs data preprocessing and detection of differentially expressed genes with statistical methods. All analysis procedures are optimized and highly automated so that even novice users with limited prior knowledge of microarray data analysis can complete an initial analysis quickly. Since all input files, analysis parameters, and executed scripts can be downloaded, EzArray provides maximum reproducibility for each analysis. In addition, EzArray integrates with Gene Expression Omnibus (GEO) and allows instantaneous re-analysis of published array data.
Conclusion: EzArray is a novel Affymetrix expression array data analysis and sharing system. EzArray provides easy-to-use tools for re-analyzing published microarray data and will help both novice and experienced users perform initial analysis of their microarray data from the location of data storage. We believe EzArray will be a useful system for facilities with microarray services and laboratories with multiple members involved in microarray data analysis. EzArray is freely available from http://www.ezarray.com/.

Springer - Publisher Connector

High-throughput processing and normalization of one-color microarrays for transcriptional meta-analyses

Author: A Brazma
A Brazma
A Campain
BM Bolstad
D Ghosh
DR Rhodes
DR Rhodes
F Hong
GP Srivastava
HK Lee
I Dozmorov
J Hubble
JC Newman
JD Wren
JE Larkin
Jonathan D Wren
L Shi
L Shi
M Kapushesky
M Severgnini
MG Dozmorov
Mikhail G Dozmorov
P Cahan
P Cahan
PK Tan
RA Irizarry
T Bammler
T Barrett
T Konishi
W Fujibuchi
WC Cheng
X Yang
Publication venue: BioMed Central
Publication date: 01/01/2011

Abstract
Background: Microarray experiments are becoming increasingly common in biomedical research, as is their deposition in publicly accessible repositories such as Gene Expression Omnibus (GEO). As a result, there has been a surge of interest in using these microarray data for meta-analytic approaches, whether to increase sample size for a more powerful analysis of a specific disease (e.g. lung cancer) or to re-examine experiments for reasons different from those examined in the initial study that generated them. For the average biomedical researcher, there are a number of practical barriers to conducting such meta-analyses, such as manually aggregating, filtering and formatting the data. Methods to automatically process large repositories of microarray data into a standardized, directly comparable format will enable easier and more reliable access to microarray data for meta-analyses.
Methods: We present a straightforward method, simple but robust against potential outliers, for automatic quality control and pre-processing of tens of thousands of single-channel microarray data files. GEO GDS files are quality checked by comparing parametric distributions and are quantile normalized to enable direct comparison of expression levels in subsequent meta-analyses.
Results: 13,000 human 1-color experiments were processed to create a single gene expression matrix from which subsets can be extracted to conduct meta-analyses. Interestingly, we found that when conducting a global meta-analysis of gene-gene co-expression patterns across all 13,000 experiments to predict gene function, normalization yielded minimal improvement over using the raw data.
Conclusions: Normalization of microarray data appears to be of minimal importance for analyses based on co-expression patterns when the sample size is on the order of thousands of microarray datasets. Smaller subsets, however, are more prone to aberrations and artefacts, and an effective means of automating normalization procedures not only empowers meta-analytic approaches but also aids reproducibility by providing a standard way of approaching the problem.
Data availability: A matrix containing normalized expression of 20,813 genes across 13,000 experiments is available for download at . Source code for GDS file pre-processing is available from the authors upon request.
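The quantile normalization step named in the abstract can be sketched as follows. This is the textbook procedure, not the authors' pipeline (their source code is available only on request); the function name, the tie-breaking rule and the toy matrix are assumptions made for illustration.

```python
from statistics import mean


def quantile_normalize(matrix):
    """Quantile-normalize a genes x samples matrix given as a list of rows.

    Every sample (column) is mapped onto one shared reference
    distribution: the mean of the k-th smallest values across samples.
    Ties are broken by row order, a simplification of the usual
    tie-averaging rule.
    """
    n_genes = len(matrix)
    n_samples = len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_samples)]

    # Reference distribution from the sorted columns.
    sorted_cols = [sorted(col) for col in columns]
    reference = [mean(sc[k] for sc in sorted_cols) for k in range(n_genes)]

    normalized = [[0.0] * n_samples for _ in range(n_genes)]
    for j, col in enumerate(columns):
        # order[rank] is the row index of the rank-th smallest value.
        order = sorted(range(n_genes), key=lambda i: col[i])
        for rank, i in enumerate(order):
            normalized[i][j] = reference[rank]
    return normalized


# Toy matrix: 4 genes (rows) x 3 samples (columns).
expr = [
    [5.0, 4.0, 3.0],
    [2.0, 1.0, 4.0],
    [3.0, 4.0, 6.0],
    [4.0, 2.0, 8.0],
]
normalized = quantile_normalize(expr)
for row in normalized:
    print([round(v, 3) for v in row])
```

After normalization every column contains the same set of values, so expression levels are directly comparable across samples. At the scale described in the abstract, an array library would be the more practical choice.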

Springer - Publisher Connector

OntoCAT -- simple ontology search and integration in Java, R and REST/JavaScript

Author: A Baneyx
B Smith
D Delamarre
DA Lindberg
Despoina Antonakaki
GO Consortium
HA Kestler
Helen Parkinson
HS Pinto
J Bard
J Day-Richter
J Malone
JA Turner
JCA Vega
JD Osborne
K Joeri van der Velde
M Horridge
M Torii
MA Swertz
MA Swertz
Misha Kapushesky
Morris A Swertz
N Sioutos
Natalja Kurbatova
NF Noy
Niran Abeygunawardena
RC Gentleman
RG Côté
Tomasz Adamusiak
Tony Burdett
TR Gruber
TR Gruber
WA Baumgartner
Publication venue: BioMed Central
Publication date: 01/01/2011

Abstract
Background: Ontologies have become an essential asset in the bioinformatics toolbox, and a number of ontology access resources are now available, for example the EBI Ontology Lookup Service (OLS) and the NCBO BioPortal. However, these resources differ substantially in mode, ease of access, and ontology content. This makes it relatively difficult to access each ontology source separately and map its contents to research data, and much of this effort is being replicated across different research groups.
Results: OntoCAT provides a seamless programming interface to query heterogeneous ontology resources including OLS and BioPortal, as well as user-specified local OWL and OBO files. Each resource is wrapped behind easy-to-learn Java, Bioconductor/R and REST web service commands, enabling reuse and integration of ontology software efforts despite variation in technologies. It is also available as a stand-alone MOLGENIS database and a Google App Engine application.
Conclusions: OntoCAT provides a robust, configurable solution for accessing ontology terms specified locally and from remote services, is available as a stand-alone tool and has been tested thoroughly in the ArrayExpress, MOLGENIS, EFO and Gen2Phen phenotype use cases.
Availability: http://www.ontocat.org

Proceedings - University of Groningen

University of Groningen

Springer - Publisher Connector

ARTS repository - University of Groningen